Metarule-guided association rule mining for program understanding
Authors
Abstract
Software systems are expected to change over their lifetime in order to remain useful. Understanding a software system that has undergone changes is often difficult because up-to-date documentation is unavailable. Under these circumstances, source code is the only reliable source of information about the system. In this paper, we apply data mining, or more specifically association rule mining, to the problem of software understanding: given the source files of a software system, we use association rule mining to gain insight into the software. Our purpose is to explore the use of association rule mining for finding interesting associations within the software that can lead to program understanding. To make association rule mining more effective, we place constraints on the mining process in the form of metarules. Metarule-guided mining is carried out to find associations that can be used to identify recurring problems within software systems. We relate metarules to re-engineering patterns, which present solutions to these problems. We apply association rule mining to five legacy systems and present results that show how extracted association rules can help in analyzing the structure of a software system and in suggesting modifications to improve that structure. A comparison of the results obtained for the five systems also reveals legacy system characteristics, which can lead to understanding the nature of open source legacy software and its evolution.
Index Terms: Data Mining, Association Rule Mining, Re-engineering Patterns, Program Understanding
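As a rough illustration of the idea in the abstract, the sketch below mines single-item rules of the form A -> B from a set of program entities, keeping only rules that fit a metarule template. Everything specific here is an assumption made for illustration and is not taken from the paper: the `kind:name` item encoding, the function name `mine_with_metarule`, the sample data, and the thresholds.

```python
def mine_with_metarule(transactions, antecedent_kind, consequent_kind,
                       min_support=0.5, min_confidence=0.8):
    """Return rules (a, b, support, confidence) matching the metarule
    antecedent_kind(X) -> consequent_kind(Y).

    Items are strings of the hypothetical form "kind:name".
    """
    n = len(transactions)
    items = set().union(*transactions)
    # The metarule restricts which items may appear on each side of a rule.
    left = sorted(i for i in items if i.startswith(antecedent_kind + ":"))
    right = sorted(i for i in items if i.startswith(consequent_kind + ":"))

    rules = []
    for a in left:
        count_a = sum(1 for t in transactions if a in t)
        for b in right:
            count_ab = sum(1 for t in transactions if a in t and b in t)
            support = count_ab / n                      # fraction containing both a and b
            confidence = count_ab / count_a if count_a else 0.0  # estimate of P(b | a)
            if support >= min_support and confidence >= min_confidence:
                rules.append((a, b, support, confidence))
    return rules


# Hypothetical input: one item set per function in a legacy system.
functions = [
    {"calls:open_file", "uses_global:config"},
    {"calls:open_file", "uses_global:config", "calls:log"},
    {"calls:log", "uses_global:config"},
    {"calls:open_file", "uses_type:Buffer"},
]

# Metarule: calls(X) -> uses_global(Y)
rules = mine_with_metarule(functions, "calls", "uses_global",
                           min_support=0.5, min_confidence=0.6)
for a, b, support, confidence in rules:
    print(f"{a} -> {b}  support={support:.2f} confidence={confidence:.2f}")
```

Restricting the candidate items by the template before counting is what keeps the search small; an unconstrained miner would also report rules between kinds of items that are of no interest for the question being asked.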
Related resources
Using Data Cubes for Metarule-Guided Mining of Multi-Dimensional Association Rules
Metarule-guided mining is an interactive approach to data mining, where users probe the data under analysis by specifying hypotheses in the form of metarules, or pattern templates. Previous methods for metarule-guided mining of association rules have primarily used a transaction/relation table-based structure. Such approaches require costly, multiple scans of the data in order to find all the la...
Metarule-Guided Mining of Multi-Dimensional Association Rules Using Data Cubes
In this paper, we employ a novel approach to metarule-guided, multi-dimensional association rule mining which explores a data cube structure. We propose algorithms for metarule-guided mining: given a metarule containing p predicates, we compare mining on an n-dimensional (n-D) cube structure (where p < n) with mining on smaller multiple p-dimensional cubes. In addition, we propose an efficient m...
Meta-Rule-Guided Mining of Association Rules in Relational Databases
A metarule-guided data mining approach is proposed and studied which applies metarules as guidance in finding multiple-level association rules in large relational databases. A metarule is a rule template in the form of "$P_1 \wedge \cdots \wedge P_m \rightarrow Q_1 \wedge \cdots \wedge Q_n$", in which some of the predicates (and/or their variables) in the antecedent and/or consequent of the metarule could be instantiated. The rule template is us...
Data sanitization in association rule mining based on impact factor
Data sanitization is a process that is used to promote the sharing of transactional databases among organizations and businesses; it alleviates concerns for individuals and organizations regarding the disclosure of sensitive patterns. It transforms the source database into a released database so that counterparts cannot discover the sensitive patterns and so data confidentiality is preserved ag...
Journal title:
- IEE Proceedings - Software
Volume 152, issue
Pages -
Publication date 2005